Credal Classification for Mining Environmental Data
نویسنده
چکیده
Classifiers that aim at doing reliable predictions should rely on carefully elicited prior knowledge. Often this is not available so they should be able to start learning from data in condition of prior ignorance. This paper shows empirically, on an agricultural data set, that established methods of classification do not always adhere to this principle. Common ways to represent prior ignorance are shown to have an overwhelming weight compared to the information in the data, producing overconfident predictions. This point is crucial for problems, such as environmental ones, where prior knowledge is often scarce and even the data may not be known precisely. Credal classification, and in particular the naive credal classifier, are proposed as more faithful ways to represent states of ignorance. We show that with credal classification, conditions of ignorance may limit the power of the inferences, not the reliability of the predictions.
منابع مشابه
Tree-Based Credal Networks for Classification
Bayesian networks are models for uncertain reasoning which are achieving a growing importance also for the data mining task of classification. Credal networks extend Bayesian nets to sets of distributions, or credal sets. This paper extends a state-of-the-art Bayesian net for classification, called tree-augmented naive Bayes classifier, to credal sets originated from probability intervals. This...
متن کاملCredible classification for environmental problems
Classifiers that aim at doing credible predictions should rely on carefully elicited prior knowledge. Often this is not available so they should start learning from data in condition of near-ignorance. This paper shows empirically, on an agricultural data set, that established methods of classification do not always adhere to this principle. Traditional ways to represent prior ignorance are sho...
متن کاملComments on "Imprecise probability models for learning multinomial distributions from data. Applications to learning credal networks" by Andrés R. Masegosa and Serafín Moral
We briefly overview the problem of learning probabilities from data using imprecise probability models that express very weak prior beliefs. Then we comment on the new contributions to this question given in the paper by Masegosa and Moral and provide some insights about the performance of their models in data mining experiments of classification.
متن کاملJNCC2: An extension of naive Bayes classifier suited for small and incomplete data sets
JNCC2 implements the Naive Credal Classifier 2 (NCC2), i.e., an extension of naive Bayes to imprecise probabilities, designed to return robust classification even on small and/or incomplete data sets, which is often the case in environmental case studies.
متن کاملOn Mining Fuzzy Classification Rules for Imbalanced Data
Fuzzy rule-based classification system (FRBCS) is a popular machine learning technique for classification purposes. One of the major issues when applying it on imbalanced data sets is its biased to the majority class, such that, it performs poorly in respect to the minority class. However many cases the minority classes are more important than the majority ones. In this paper, we have extended ...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره شماره
صفحات -
تاریخ انتشار 2002